ABSTRACT

The pushback estimates are defined by a preliminary data
modification followed by the application of a robust statistic
to the modified data. The form of the pushback estimate depends
upon the choice of a scale estimate for the original data, a set
of central values of order statistics, and a constant multiplier.
Particular forms of the pushback estimates are both simple and
perform rather well relative to a good estimate in the w-estimate
or M-estimate category.




1. Introduction.


We have some quite good estimates of location in the
w-estimate and M-estimate category. The pushback estimates
are another category of estimates, one that appears very
different, that satisfies two requirements. First, the
pushback estimates of a certain form perform well over a
wide range of distributions. Second, the estimates are
relatively simple to understand and use.

Monte Carlo studies indicate that a well-chosen form of
the pushback achieves a maximin efficiency (relative to the
w6-biweight) of 89% for sample size 20. This performance is
measured over the set of distributions including the Gaussian,
one-wild Gaussian, mixture, slacu, and slash.

2.  The  pushback. 

The  pushback  procedures  are  based  on  preliminary  data 
modification.  The  order  statistics  of  the  data  are  modified 
before  a  simple  robust  estimate  of  location  is  applied. 

More formally, our procedure is as follows. Suppose we are
given n observations,

$$y_1, y_2, \ldots, y_n,$$

from a particular situation $\{f_i : i = 1, \ldots, n\}$ where the $f_i$
are location-scale densities. The situation may be either
simple or compound (Bruce, Pregibon, Tukey (1981)). The
procedure modifies the order statistics of the n observations

$$y_{(1)}, y_{(2)}, \ldots, y_{(n)},$$

by subtracting some function of i, p(i),

$$y_{(1)} - p(1),\; y_{(2)} - p(2),\; \ldots,\; y_{(n)} - p(n).$$

The form of p(i) considered is:

$$p(i) = k \cdot s \cdot a(i)$$

where k is a constant, s is an estimate of the scale of the
data {y(i)}, and {a(i)} is a set of central values of order
statistics from a suitable unit distribution. We then apply
T, a robust estimate, to the set {y(i) - k·s·a(i)} to determine
a location estimate for the distribution of the {y(i)}.

We  will  call  this  procedure  the  pushback  T  when  the 
estimate  T  is  applied  to  the  modified  data,  or  pushback  when 
we  have  no  particular  estimate  in  mind.  The  pushback  was 
previously  studied  by  L.  Nanni  (Nanni  (1979)). 
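
The procedure of this section can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the study: the function names are hypothetical, the Gaussian central values are approximated with Blom's formula rather than taken from tables, and the default scale (MAD) and default T (the median) are just one of the combinations considered in the text.

```python
from statistics import NormalDist, median


def gaussian_central_values(n):
    """Approximate expected values of standard Gaussian order statistics.

    Assumption: Blom's approximation, used here in place of tabulated values.
    """
    nd = NormalDist()
    return [nd.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]


def mad(values):
    """Median absolute deviation from the median (one possible scale s)."""
    m = median(values)
    return median(abs(v - m) for v in values)


def pushback(y, k, scale=mad, location=median, central_values=None):
    """Apply `location` (T) to the pushed-back data y(i) - k*s*a(i)."""
    ordered = sorted(y)  # order statistics y(1) <= ... <= y(n)
    a = central_values or gaussian_central_values(len(ordered))
    s = scale(ordered)
    modified = [yi - k * s * ai for yi, ai in zip(ordered, a)]
    return location(modified)
```

With k = 0 the data are not modified, so `pushback(y, 0)` reduces to T applied to the raw data.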

3.  Simulation  Cases. 

Various forms of

* the location estimate

* the central order-statistic values

* the scale estimate

The subscript j denotes the number of the iteration. Iteration
to tolerance is stopped when $|T_j - T_{j-1}| < .0005$ or when
j = 20, whichever occurs first.
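
The stopping rule above can be illustrated with an iterated biweight w-estimate. The sketch below rests on assumptions not stated in the surviving text: that the iteration starts at the median, that the scale is the MAD held fixed across iterations, and that the constant c = 6 corresponds to the w6-biweight. Only the tolerance .0005 and the cap of 20 iterations come from the text.

```python
from statistics import median


def w_biweight(y, c=6.0, tol=0.0005, max_iter=20):
    """Iterated biweight w-estimate of location.

    Assumptions: start at the median, scale s = MAD held fixed, and
    weights w_i = (1 - u_i^2)^2 for |u_i| < 1, with u_i = (y_i - T) / (c*s).
    """
    t = median(y)
    s = median(abs(v - t) for v in y)
    if s == 0:
        return t  # degenerate sample: no spread to weight against
    for _ in range(max_iter):  # stop when j = 20 at the latest
        u = [(v - t) / (c * s) for v in y]
        w = [(1 - ui * ui) ** 2 if abs(ui) < 1 else 0.0 for ui in u]
        t_new = sum(wi * v for wi, v in zip(w, y)) / sum(w)
        converged = abs(t_new - t) < tol  # |T_j - T_{j-1}| < .0005
        t = t_new
        if converged:
            break
    return t
```

An observation far from the bulk of the data receives weight zero, which is why a single wild value has little effect on the result.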


Central values of order statistics for a sample of size
n from a Gaussian, a logistic, and a cosine-bell distribution,
each with mean zero and variance one, were used. For the
Gaussian and logistic distributions, expected values of
order statistics were chosen as the central values (Owen
(1962) and Birnbaum and Dudman (1963)). For the cosine-bell
density,

$$f(y) = \begin{cases} \tfrac{1}{2}\cos(y - \ldots) & \ldots \\ 0 & \text{otherwise,} \end{cases}$$

the values $F^{-1}\!\left(\frac{i - 1/3}{n + 1/3}\right)$ were used.
$F^{-1}\!\left(\frac{i - 1/3}{n + 1/3}\right)$ has been shown
to be a good approximation to the median of y(i). Letting
$a(i)_{Gau}$, $a(i)_{log}$, and $a(i)_{cob}$ denote the central value of the
i-th order statistic from a Gaussian, logistic, and cosine-bell
distribution, respectively, and noting that a(i) =
-a(n+1-i) because of symmetry, we see in table 1 that


$$|a(i)_{Gau}| < |a(i)_{cob}|, \qquad |a(i)_{Gau}| > |a(i)_{log}|, \qquad i = 2, \ldots, 19,$$

$$|a(i)_{Gau}| > |a(i)_{cob}|, \qquad |a(i)_{Gau}| < |a(i)_{log}|, \qquad i = 1, 20.$$

[Table 1: ... order statistics; cosine-bell values are $F^{-1}((i - 1/3)/(n + 1/3))$]

The differences between the Gaussian {a(i)} and the two
other sets of {a(i)} can be seen in the plot of $\log a_Q(i)$
against $\log a_{Gau}(i)$ (figure 1) for Q = {log, cob}.
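
The Gaussian and logistic comparison can be reproduced approximately with the short sketch below. It is an illustration under a stated assumption: both sets of central values are computed as medians via $F^{-1}((i - 1/3)/(n + 1/3))$, whereas the text used tabulated expected values for these two distributions. The cosine-bell case is omitted and the function names are hypothetical.

```python
from math import log, pi, sqrt
from statistics import NormalDist


def median_plotting_positions(n):
    """Probabilities (i - 1/3) / (n + 1/3) for i = 1, ..., n."""
    return [(i - 1 / 3) / (n + 1 / 3) for i in range(1, n + 1)]


def gaussian_central_values(n):
    nd = NormalDist()  # mean zero, variance one
    return [nd.inv_cdf(p) for p in median_plotting_positions(n)]


def logistic_central_values(n):
    scale = sqrt(3) / pi  # logistic scale that gives variance one
    return [scale * log(p / (1 - p)) for p in median_plotting_positions(n)]


def gaussian_exceeds_logistic(n=20):
    """Map i -> True when |a(i)_Gau| > |a(i)_log|."""
    gau = gaussian_central_values(n)
    lgs = logistic_central_values(n)
    return {i + 1: abs(g) > abs(l) for i, (g, l) in enumerate(zip(gau, lgs))}
```

For n = 20 this approximation gives the same ordering as stated above: the Gaussian values are larger in absolute value for i = 2, ..., 19 and smaller for i = 1 and 20.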


Many scale estimates were tested. These included the
median jump ratio, the median absolute deviation, and various
other percentage points of {|y(i) - med{y(i)}|}. The median
jump ratio (MJR) is defined as

$$\operatorname*{med}_{i = 1, \ldots} \{\, \ldots \,\}$$

when n is even, as is the case for the simulations discussed
here. The MJR used the same {a(i)} as are used in pushing
back the data and is thus matched to the specific pushback
form. The median absolute deviation from the median (MAD)
is defined as med{|y(i) - med{y(i)}|}, i.e., the 50% point of
{|y(i) - med{y(i)}|}. Other percent points of {|y(i) -
med{y(i)}|} were used. The complete set of lower percentage
points was P = 37.5, 45, 50, 55, 70, 75, 80, 85, and 90. For
example, the 45% point is z(9) where {z(i)} are the ordered set
{|y(i) - med{y(i)}|}. We will refer to the P% point of
{|y(i) - med{y(i)}|} as P%AD.
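
A minimal sketch of the P%AD scale estimate follows. The rank rule is an assumption: the P% point is taken as z(r) with r the smallest integer not below P·n/100, which reproduces the example that the 45% point is z(9) when n = 20. How non-integer ranks (such as P = 37.5) were handled in the study is not stated, so the rounding here is illustrative only.

```python
from math import ceil
from statistics import median


def pad(y, p):
    """P%AD: lower P percent point of the absolute deviations from the median."""
    m = median(y)
    z = sorted(abs(v - m) for v in y)  # ordered absolute deviations z(1..n)
    # Assumption: rank = ceil(P * n / 100); the small offset guards against
    # floating-point noise when P * n / 100 is an exact integer.
    rank = max(1, ceil(p * len(z) / 100 - 1e-9))
    return z[rank - 1]
```

Under this rule `pad(y, 50)` returns z(10) for n = 20, which differs slightly from the usual convention of averaging the two middle deviations.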


In order to evaluate the performance of the pushback in
its various forms, data were simulated from six situations.
The situations and their densities for simple and compound
situations are:

Gaussian (Gaus): $\frac{1}{\sqrt{2\pi}} \exp\{-y^2/2\} = G(0,1)$

slash (slash) = ratio of an independent Gaussian to unif(0,1):

$$\begin{cases} \dfrac{1}{\sqrt{2\pi}\,y^2}\left[1 - \exp\{-\tfrac{1}{2}y^2\}\right] & \text{for } y \ne 0 \\[2ex] \dfrac{1}{2\sqrt{2\pi}} & \text{for } y = 0 \end{cases}$$

Cauchy (Cauchy): $\dfrac{1}{\pi(1 + y^2)}$

one wild Gaussian, scale 10 (OWG): $(n-1)\,G(0,1) + 1\,G(0,100)$

mixture, scale 10 (mix): $.95\,G(0,1) + .05\,G(0,100)$

slacu (slacu) = ratio of independent Gaussian to $(\mathrm{unif}(0,1))^{1/3}$:

$$\begin{cases} \dfrac{3}{\sqrt{2\pi}\,y^4}\left[2 - (y^2 + 2)\exp\{-\tfrac{1}{2}y^2\}\right] & \text{for } y \ne 0 \\[2ex] \dfrac{3}{4\sqrt{2\pi}} & \text{for } y = 0 \end{cases}$$
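
The six situations can be simulated directly from their definitions. The sketch below is illustrative only and does not reproduce the generators used in the study: it relies on Python's standard `random` module, draws the Cauchy by inverting its distribution function, and uses a hypothetical function name and hypothetical situation labels.

```python
import math
import random


def sample(situation, n=20, rng=None):
    """Draw one sample of size n from a named situation (location zero)."""
    rng = rng or random.Random()

    def unif():
        return 1.0 - rng.random()  # lies in (0, 1], so division is safe

    if situation == "gaus":
        return [rng.gauss(0, 1) for _ in range(n)]
    if situation == "slash":
        return [rng.gauss(0, 1) / unif() for _ in range(n)]
    if situation == "cauchy":
        return [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)]
    if situation == "owg":
        # n - 1 observations from G(0,1), one from G(0,100) (std. dev. 10)
        return [rng.gauss(0, 1) for _ in range(n - 1)] + [rng.gauss(0, 10)]
    if situation == "mix":
        # each observation comes from G(0,100) with probability .05
        return [rng.gauss(0, 10 if rng.random() < 0.05 else 1) for _ in range(n)]
    if situation == "slacu":
        return [rng.gauss(0, 1) / unif() ** (1 / 3) for _ in range(n)]
    raise ValueError(f"unknown situation: {situation}")
```

Note that G(0,100) denotes variance 100, so the wild observations are drawn with standard deviation 10.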

The six distributions above include the "three corners", the
Gaussian, slash, and one-wild Gaussian. Each of these three
represents one extreme of a data type we regard as likely to
be encountered in practice. Gaussian data is considered to
be extremely "nice" data and is the norm against which we
judge data from other distributions, for example, saying
data is from a distribution with "heavier tails" than the
Gaussian. One-wild Gaussian (scale 10) data has one observation
from a Gaussian distribution with mean zero and variance
100 and the rest from a standard Gaussian and so is
very likely to have one outlier. The slash density behaves
as the Gaussian does in the middle and as the Cauchy does in
the tails. The first part of this statement can be seen by
looking at

$$\lim_{|y| \to 0} \frac{1}{\sqrt{2\pi}\,y^2}\left(1 - \exp(-\tfrac{1}{2}y^2)\right)$$

which, to second order in y, is, ignoring multiplicative
constants, $\exp(-\tfrac{1}{4}y^2)$, the form of a Gaussian density.
Similarly, the second part of the statement is seen by looking at

$$\lim_{|y| \to \infty} \frac{1}{\pi(1 + y^2)} \quad \text{and} \quad \lim_{|y| \to \infty} \left|\frac{1}{\sqrt{2\pi}\,y^2}\left(1 - \exp(-\tfrac{1}{2}y^2)\right)\right|.$$

This shows that the slash tail-density decreases as $y^{-2}$, as
does the Cauchy. Thus the slash satisfies the empirical
observation that data is usually Gaussian in the middle,
while having much longer tails than the Gaussian. The three
corners are good quantitative standards against which to
seek good performance.

The remaining three densities have been included to
gain further information. The Cauchy is included in order
to understand how the pushback works on data from a long-tailed
but peaked distribution. We expect, however, to
encounter Cauchy-like data rarely in practice, and, thus, do
not seek good performance against the Cauchy. The mixture
at scale 10, because it is a milder form of the one-wild
Gaussian at scale 10, gives information on the transition
from the Gaussian to the OWG. The tail-density of the slacu
decreases as $t^{-4}$ in comparison to the slash tail-density
which decreases as $t^{-2}$. The slacu tail-density is like a $t_3$
tail-density. (Note that $t_3$ tends to behave like .95G(0,1) +
.05G(0,9) (F. Hampel (1979)).) These facts indicate that the
slacu should give us information on the transition from
G(0,1) to slash and, in a less clear way, on the transition
from G(0,1) to a mixture-like density.

Cases for which the simulations were performed are
listed in table 2. All are for sample size twenty. Variances
were calculated using a swindle (G. Simon (1975)) and
are based on 500 samples. The computations discussed here
were done on the Statistics Department PDP11/40. Gaussian
samples were generated via an algorithm due to Forsythe and
Ahrens-Dieter (J. H. Ahrens and U. Dieter (1974)). Uniform
random numbers were generated using Knuth's algorithm M
(Knuth (1969)).
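
As a rough illustration of how such variances are estimated, the sketch below computes a plain Monte Carlo variance of a location estimate over 500 samples of size 20. It is not the method of the study: no swindle is applied, the random number generator is Python's own, and the function names are hypothetical.

```python
import random
from statistics import median, variance


def draw_gaussian(n, rng):
    """One standard Gaussian sample of size n."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def mc_variance(estimator, draw, n=20, n_samples=500, seed=0):
    """Plain Monte Carlo variance of `estimator` over repeated samples."""
    rng = random.Random(seed)
    estimates = [estimator(draw(n, rng)) for _ in range(n_samples)]
    return variance(estimates)


if __name__ == "__main__":
    # Variance of the ordinary median for Gaussian data, n = 20.
    print(mc_variance(median, draw_gaussian))
```

For the median with Gaussian data the result should be in the neighborhood of the value .0731 quoted in section 4, though without a swindle the estimate is noticeably noisier.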


Table 2

Simulation Cases

median
    mjr-gaus-g;c(.8,1.0,1.2);sl(.4,.8,1.0,1.2);
             m(.4,.8,1.0,1.2,2.0);w(.4,.8,1.0,1.2)
    mjr-log-g;c;sl;m;w(.8,1.0,1.2)
    mad-gaus-g;c;sl;m;w(.8,1.0,1.2), ((.4,.8,1.2,2.0)/.6745)
    mad-log-  "  "
    mjr-gaus*-g;c;sl;m;w(.4,.8,1.0,1.2,2.0)
    mjr-gaus**-  "  "
    mjr-log*-  "  "
    mjr-cob-g;c;sl;m;w;su(.4,.8,1.0)
    mjr-gaus***-  "  "
    37.5%AD-gaus-g;c;sl;m;w;su(.4,.8,1.0,1.2)
    45%AD-gaus-  "  "
    55%AD-gaus-  "  "
    70%AD-gaus-  "  "
    75%AD-gaus-  "  "
    80%AD-gaus-  "  "
    85%AD-gaus-  "  "
    90%AD-gaus-  "  "

w6-biweight
    mjr-gaus-g;c;sl;m;w;su(.4,.8,1.0,1.2,2.0)
    mjr-log-  "  "
    mjr-gaus*-  "  "
    mjr-gaus**-  "  "
    mjr-log*-  "  "
    mad-gaus-g;c;sl;m;w((.4,.8,1.0,1.2,2.0)/.6745)
    mad-log-  "  "
    mjr-cob-g;c;sl;m;w;su(.4,.8,1.0)
    mjr-gaus***-  "  "
    37.5%AD-gaus-g;c;sl;m;w;su(.4,.8,1.0,1.2)
    45%AD-gaus-  "  "
    55%AD-gaus-  "  "
    70%AD-gaus-  "  "
    75%AD-gaus-  "  "
    80%AD-gaus-  "  "
    85%AD-gaus-  "  "
    90%AD-gaus-  "  "

9-biweight
    mjr-gaus-g(.4,.8,1.0,1.2);c(.8,1.0,1.2);sl(.4,.8,1.0,1.2);
             m;w(.8,1.0,1.2);su(.4,.8,1.0,1.2,2.0)
    mjr-log-g;c;sl;m;w(.8,1.0,1.2);su(.4,.8,1.0,1.2,2.0)
    mad-gaus-g;c;sl;m;w(.8,1.0,1.2)
    mad-log-  "  "
    mjr-gaus*-g;c;sl;m;w;su(.4,.8,1.0,1.2,2.0)
    mjr-gaus**-  "  "
    mjr-log*-  "  "

*    1 smooth
**   2 smooths
***  pushback on inner 18 only
4. Discussion of the Simulation Results

The results of the simulation cases listed in table 2
are given in figures 2-9 and in tables 3 and 4. For example,
the results of the simulations for the first line in
table 2 are given in figure 2. The scale estimate is MJR,
the {a(i)} are the central order-statistic values from a
Gaussian, the location estimate is the median, and the
simulations were done for various k-values for Gaussian, Cauchy,
slash, mix, and OWG data distributions. For ease of notation,
we will refer to this pushback form as the MJR-Gaus-pushback
median. In general, we refer to a scale estimate-{a(i)}
distribution-pushback location estimate, the k-value
and data-distribution being specified when necessary.

In figure 2, the first number listed at each k-value,
data-distribution combination is the variance of the
MJR-Gaus-pushback median. The variance of this variance
estimate is shown in parentheses here as it is in the similar
figures which follow. The data-distribution is represented
by the angle of the arc on the set of concentric circles and
the k value is indicated by the circle radius. The inner
circle contains values for a non-pushback estimate. For
example, .0731 is the estimated variance of the median when
the data is Gaussian.

Figure 2 also gives the MJR-logistic-pushback median
variances. These values are the second entry in the figure.
Figure 2: Variance and variance of variance x 10
(in parentheses) of MJR-Gaus-pushback
median and MJR-log-pushback median


Similarly, the MJR-Gaus- and MJR-logistic-pushback
w6-biweight and 9-biweight variances are given in figures 3 and
4, respectively.

Table 5 shows the optimum k-value (the k-value with
minimum variance) and the corresponding optimum variance for
a given scale estimate, {a(i)} distribution, location estimate,
and data distribution. From table 5 and figures 2-4,
we see the following:

* First, for MJR and each of the median, w6-biweight,
and 9-biweight, the variance at the optimum k-value for
the Gaussian {a(i)} is less than that at the optimum k-value
for the logistic {a(i)}.

* Second, for each of the location estimates, the
simulations indicate that the optimum k-values for a given
data-distribution for the logistic and Gaussian
{a(i)} versions of the pushback with s = MJR are the
same.

* Also, looking at lines 1, 15, and 29 of table 5, we
see that the median and w6-biweight have less-changing
optimum k values across distributions than the 9-biweight.

Approximate equality of k-values for the different
distributions is of importance in its effect on the comparison
of pushback and a known optimal estimate. We see this
comparison in plots of the relative efficiency of pushback
to a known good estimate, say w6-biweight, against the log
of the pushback constant, log k. Consider

$$E = \max_{k}\left\{\min_{Q}\{\text{rel eff}\}\right\},$$

where Q is the set of distributions specified, as a measure
of the performance of the pushback. Approximately equal
optimum k-values across distributions will help to keep E
high by preventing the relative efficiency plot for one or
more distributions from plunging while the other plots
remain high.
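
The measure E can be computed from a table of variances as in the sketch below. The function and argument names are hypothetical, and relative efficiency is taken as (variance of the reference estimate)/(variance of the pushback), the definition used for the w6-biweight comparisons in this section.

```python
def maximin_efficiency(ref_var, pushback_var):
    """Return (k, E) with E = max over k of min over Q of relative efficiency.

    ref_var:      {distribution: variance of the reference estimate}
    pushback_var: {k: {distribution: variance of the pushback at that k}}
    """
    best_k, best_e = None, float("-inf")
    for k, by_dist in pushback_var.items():
        # worst-case relative efficiency over the distributions at this k
        worst = min(ref_var[d] / v for d, v in by_dist.items())
        if worst > best_e:
            best_k, best_e = k, worst
    return best_k, best_e
```

A k at which one distribution's efficiency plunges is penalized even if the others are high there, which is why roughly equal optimum k-values across distributions help to keep E high.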

Clearly, investigation into the effect of a particular
scale estimate and {a(i)} distribution is necessary for tuning
and understanding of the procedure. We first consider
alternative scale estimates. Figure 5 shows the MAD-Gaus-pushback
median variances and the MAD-logistic-pushback
median variances, in that order. (Some of the k values
shown here are approximate. Each of .4, .8, 1.0, 1.2, and
2.0 is divided by .6745 and the rounded values .6, 1.2, 1.5,
1.8, and 3.0 are shown in figure 5.) Figures 6 and 7 show
similar results for w6-biweight and 9-biweight, with figure
6 again showing approximate k-values. From these values and
those in table 5, we see again that the variances when Gaussian
{a(i)} are used are better than or very close to the
variances when logistic {a(i)} are used.

Given the results discussed above, that when s = MAD or
MJR, and for the median and w6-biweight, Gaussian {a(i)}
yield procedures which generally have smaller variances than
those in which logistic {a(i)} are used, we consider a third
form of {a(i)}, cosine-bell {a(i)}. Figures 8 and 9 give
results for the MJR-cob-pushback median and w6-biweight.
Table 5 shows smaller variances for cosine-bell {a(i)} than
for Gaussian {a(i)} at optimum k-values, but also that the
optimum k-values for the cosine-bell {a(i)} are more spread
than those for Gaussian {a(i)}. Figures 10 and 11 show
plots of the relative efficiency of the pushback to
w6-biweight against the log pushback constant, ln k, for the
MJR-Gaus-pushback median (Figure 10) and the MJR-cob-pushback
median (Figure 11). Relative efficiency is again
(variance w6-biweight)/(variance pushback) for the particular
form of the pushback being studied. The cosine-bell
version has a slightly higher maximin relative efficiency
(82% vs. 81%). In both cases, however, the slash curve
keeps the values low by not attaining its maximum at a value
of ln k near the maximizing ln k of the other distributions.
Since cosine-bell and Gaussian {a(i)} have very close
performance, we will continue testing with the more readily
accepted norm, the Gaussian case.

Figure 5: Variance and variance of variance x 10
(in parentheses) of MAD-Gaus-pushback
median and MAD-log-pushback median*

*k-values are approximate; see note in text.

Figure 7: Variance and variance of variance x 10
(in parentheses) of MAD-Gaus-pushback
9-biweight and MAD-log-pushback 9-biweight

Returning to our comparison of scale estimates, we look
at the results for P%AD-Gaus-pushback median and P%AD-Gaus-pushback
w6-biweight where P%AD is the scale estimate
defined as the value at the lower P percent point of the
absolute deviations from the median, {|y(i) - med{y(i)}|}.
The P values used were 37.5, 45, 50 (i.e., the MAD), 55, 70,
75, 80, 85, and 90. The variances for these pushback forms
are shown in tables 3 and 4 for the P%AD-Gaus-pushback
median and P%AD-Gaus-pushback w6-biweight, respectively.
Figures 12-20 show relative efficiency plots for the
P%AD-Gaus-pushback median against ln k, where relative efficiency
is again defined with respect to the w6-biweight. The following
table shows the values of $\max_k\{\min_Q(\text{rel eff})\}$ for the
various values of P used in P%AD.

Figure 11: Efficiency (relative to w6-biweight)
of MJR-cob-pushback median vs. ln k

P used in P%AD
as scale estimate      max{min(rel eff)} (%)

     37.5                     80
     45                       85
     50                       88
     55                       89
     70                       83
     75                       83
     80                       72
     85                       54
     90                       -

For each pushback listed in the table, Gaussian {a(i)} were
used. We see that the pushback with s = 55%AD performs well
in comparison to w6-biweight, achieving 89% or higher relative
efficiency for all distributions considered.
Figure 13:

Figure 14: Efficiency (relative to w6-biweight)
of MAD-Gaus-pushback median vs. ln k
5. Conclusions from the Simulations

The results described in section 4 and the requirement
of simplicity of the pushback procedure provide preliminary
answers to the following question:

What choices of

* location estimate,

* scale estimate, and

* {a(i)} distribution

yield good performance for the associated pushback procedure
while still maintaining its simplicity?

Gaussian {a(i)} were shown to perform better than
logistic {a(i)} for s = MJR and MAD with T = median and
w6-biweight. The Gaussian central order-statistic values
also were shown to have performance very close to that of
cob {a(i)} for s = MJR with T = median. We choose Gaussian
{a(i)} as the central order-statistic values because of
their performance and the wide acceptance of Gaussianity as
the norm against which we judge other distributions.

The extensive simulations were done for T = median
rather than T = w6-biweight or 9-biweight because of a
preference for simplicity in procedure. Also, since the
w6-biweight exhibits good performance, it is unlikely that
modifications of it will have marked decreases in variance
(see table 5).

The scale estimate s = 55%AD, when used with T = median
and Gaussian {a(i)}, achieves at least 89% of the efficiency
of the w6-biweight (Figure 15). This relative efficiency is
higher than those for the other scale choices tested when used
with Gaussian {a(i)} and the median as the location estimate.

What the conclusions indicate then is, first, that
Gaussian {a(i)} are a good form of central order-statistic
values. This is because most data are Gaussian in the
center. This choice makes it possible to find a line of the
form z(i) = k·s·a(i), where s = P%AD, which is parallel to
the central P% of the data. The value of P will depend on
the sampling situation. Second, we see that a good choice
of scale estimate is the P%AD where P = 55. This choice is a
maximin choice since our criterion for performance is that
the five distributions, Gaussian, OWG, mix, slash, and
slacu, all exhibit good performance at the same value of k
using this choice of s. The residuals of the data from the
line specified by these choices of s and {a(i)} are then
well-enough behaved for each of the five distributions that
the median applied to these residuals, the pushback median,
achieves good performance relative to the w6-biweight.

The simulation conclusions discussed thus far are based
on overall descriptions of the behavior of the pushback
procedures. Variances based on the 500 samples of size 20 for
the various pushback procedures are compared as are the
efficiencies relative to w6-biweight.

We would like to look at the behavior of a specific
pushback for particular data configurations in order both to
understand the procedure and to fine-tune the procedure. To
do this we need the optimum estimate for the particular data
configuration. Then identifying data configurations where
the pushback performs poorly in comparison to the optimum
estimate may indicate the tuning necessary to improve the
performance of the procedure. We would also like to determine
the minimum attainable variance for a particular sampling
situation and the maximum attainable polyefficiency
over several sampling situations in order to determine
efficiencies relative to this optimum. Configural sampling and
configural polysampling provide a method for achieving the
analysis discussed in this paragraph. The post-configural
polysampling pushback results are discussed in (Krystinik
(1981)).